Nonparametric priors for finite unknown cardinalities of sampling spaces
Abstract
The Dirichlet process and its popular generalization, the Pitman-Yor process, are often considered as priors in the context of multinomial sampling. They permit inferences on discrete sampling spaces of infinite cardinality. In fact, they assume a priori that there are infinitely many different outcomes to be observed, and rule out the possibility that there are only finitely many. This, for instance, limits their usage in species sampling problems, where among other things we have to infer an unknown but finite cardinality. Following first principles of inductive inference, such as exchangeability, we characterize a new class of nonparametric priors that extends the Pitman-Yor process and permits the elicitation of a posterior distribution for the cardinality of the sample space, as well as the derivation of non-degenerate probabilities for any number of novel outcomes to appear given a finite sample.

Over the course of the last decade, the Dirichlet process, and even more so its generalization, the Pitman-Yor process, have met with indisputable success in the field of discrete Bayesian nonparametrics. One obvious explanation of this success lies in the analytical tractability of the inference procedures they support in settings where an infinite sample space is to be considered. This benefit is certainly best exemplified by the multiple uses of these processes as priors in clustering problems where the number of clusters is unknown, although applications where these processes drive the generation of the observations themselves have also gained wide popularity. While meant to tackle very different problems, these applications share one common feature: only properties pertaining to finite samples are queried.
In the context of clustering, one is indeed typically interested in the posterior distribution of the random partition of the observations into clusters; in the species sampling problem, the focus is rather on predicting the next outcome in a manner consistent with the possibility that an outcome unobserved so far might appear. Nonetheless, the inference of universal properties of the sample space, such as how many clusters one would encounter if one were to keep sampling indefinitely, or how many species there are in total, is usually avoided. This bias does not really follow from a lack of scientific interest in the answers to these questions, but rather from the full support provided by those processes to the universal hypothesis that infinitely many different outcomes would ultimately occur. In other words, any hypothesis set-
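To make the point about novel outcomes concrete, the following is a minimal sketch of the standard Pitman-Yor predictive rule, not of the new class of priors the paper proposes. The function name and the parameter names `discount` and `concentration` are illustrative assumptions. Given a sample of size n with k distinct outcomes, the rule assigns probability (count - discount) / (concentration + n) to each outcome already seen and (concentration + discount * k) / (concentration + n) to a novel one.

```python
from collections import Counter


def pitman_yor_predictive(sample, discount, concentration):
    """Predictive probabilities for the next draw under a Pitman-Yor prior.

    Hypothetical helper for illustration. Returns a pair:
    (dict mapping each seen outcome to its probability,
     probability that the next outcome is novel).
    """
    # Standard parameter constraints: 0 <= discount < 1, concentration > -discount.
    if not 0 <= discount < 1 or concentration <= -discount:
        raise ValueError("need 0 <= discount < 1 and concentration > -discount")

    counts = Counter(sample)
    n = len(sample)   # sample size
    k = len(counts)   # number of distinct outcomes observed so far

    if n == 0:
        # With no data, the first draw is necessarily a novel outcome.
        return {}, 1.0

    denom = concentration + n
    seen = {x: (c - discount) / denom for x, c in counts.items()}
    novel = (concentration + discount * k) / denom
    return seen, novel


if __name__ == "__main__":
    seen, novel = pitman_yor_predictive(["a", "a", "b"], discount=0.5, concentration=1.0)
    print(seen, novel)
```

Under these constraints the novel-outcome probability is strictly positive for every finite sample, whenever the discount or the concentration is positive. This is the sense in which such priors never entertain a finite sample space. Setting `discount=0` recovers the Dirichlet process case.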
Similar resources
Nonparametric Bayesian Methods
Most of this book emphasizes frequentist methods, especially for nonparametric problems. However, there are Bayesian approaches to many nonparametric problems. In this chapter we present some of the most commonly used nonparametric Bayesian methods. These methods place priors on infinite dimensional spaces. The priors are based on certain stochastic processes called Dirichlet processes and Ga...
Bayesian nonparametric estimation of finite population quantities in absence of design information on nonsampled units
In probability proportional to size (PPS) sampling, the sizes for nonsampled units are not required for the usual Horvitz-Thompson or Hajek estimates, and this information is rarely included in public use data files. Previous studies have shown that incorporating information on the sizes of the nonsampled units through semiparametric models can result in improved estimates. When the design vari...
Posterior Consistency of Species Sampling Priors
Recently there has been increasing interest in species sampling priors, the nonparametric priors defined as the directing random probability measures of the species sampling sequences. In this paper, we show that not all of the species sampling priors produce consistent posteriors. In particular, in the class of Pitman-Yor process priors, the only priors rendering posterior consistency are esse...
Approximate Dirichlet Process Computing in Finite Normal Mixtures: Smoothing and Prior Information
A rich nonparametric analysis of the finite normal mixture model is obtained by working with a precise truncation approximation of the Dirichlet process. Model fitting is carried out by a simple Gibbs sampling algorithm that directly samples the nonparametric posterior. The proposed sampler mixes well, requires no tuning parameters, and involves only draws from simple distributions, including t...
Nonparametric Priors on Complete Separable Metric Spaces
A Bayesian model is nonparametric if its parameter space has infinite dimension; typical choices are spaces of discrete measures and Hilbert spaces. We consider the construction of nonparametric priors when the parameter takes values in a more general functional space. We (i) give a Prokhorov-type representation result for nonparametric Bayesian models; (ii) show how certain tractability proper...
Publication date: 2011